Advanced Dynamic Disk Memory Module

ABSTRACT

Memory modules address the growing gap between main memory performance and disk drive performance in computational apparatus such as personal computers. Memory modules disclosed herein fill the need for substantially higher storage capacity in end-user add-in memory modules. Such memory modules accelerate the availability of applications, and data for those applications. An exemplary application of such memory modules is as a high capacity consumer memory product that can be used in Hi-Definition video recorders. In various embodiments, memory modules include a volatile memory, a non-volatile memory, and a command interpreter that includes interfaces to the memories and to various busses. The first memory acts as an accelerating buffer for the second memory, and the second memory provides non-volatile backup for the first memory. In some embodiments data transfer from the first memory to the second memory may be interrupted to provide read access to the second memory.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of non-provisional application Ser. No. 11/453,293, filed 13 Jun. 2006, and entitled “Advanced Dynamic Disk Memory Module”, which claimed the benefit of previously filed provisional application 60/690,451, filed 13 Jun. 2005, and entitled “Advanced Dynamic Disk Memory Module”, the entirety of both of which is hereby incorporated by reference.

FIELD OF THE INVENTION

The present invention relates generally to a plug and play end-user add-in memory module for computers and consumer electronic devices, and more particularly relates to methods and apparatus for providing additional high performance memory resources for computational or media systems used to accelerate application launch and improve operational performance of application and data sets.

BACKGROUND

The concept of a RAM based disk substitute has been a part of the Personal Computer (PC) for many years. There are many software programs that set aside blocks of installed main memory for use as a temporary disk partition. Such partitions are intended to improve the overall performance of the PC. One advantage of such a product is the increased speed at which a user can access a program or data that is stored in the RAM-based disk partition. However, a drawback of these products is reduced system performance when too much of the main memory is reserved for the RAM-based disk partition. In this case, insufficient scratch pad memory is available to hold the executing program and associated data. This reduction in available main memory forces the PC to use the Hard Disk Drive (HDD) to extend the storage space that it requires to run the application and access the data. This action is commonly referred to as paging.

It is well-known that access performance of a HDD is lower than that of main memory. The performance degradation due to paging to the HDD rapidly overwhelms any performance gain from the use of a RAM-based disk. The performance degradation effects are further compounded in systems that share main memory for integrated graphics solutions (known as Unified Memory Architecture (UMA)). The UMA graphics rely on sharing main memory for the frame buffer and operational scratchpad in a manner similar to that of RAM-based disk products. Systems supporting RAM-based disks and UMA graphics have three sources competing for main memory resources.

Most PC systems offer upgrade options to increase the amount of main memory via, for example, existing extra DRAM memory module connectors on the motherboard. However, these extra connectors are usually difficult for the end-user to access, and in the case of many new systems may not even be available at all.

What is needed is a product and a method for the PC end-user to add low-cost high performance memory to their PC that improves the overall performance of that personal computer with no impact to the main memory resources.

SUMMARY OF THE INVENTION

Briefly, a memory module, in accordance with the present invention, provides the functionality of a RAM disk without incurring the main memory performance degradation associated with conventional RAM disks. Various embodiments of the present invention may be added to a system via internal connectors, or may be connected via readily accessible external connectors as is described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a memory hierarchy diagram showing the relative position of the Advanced Dynamic Disk Memory Module in the overall Personal Computer System memory hierarchy.

FIG. 2 is a Personal Computer (PC) system block diagram showing relative architectural location of key components and bus interconnects.

FIG. 3 is a PC System Block Diagram showing design configuration options for the Advanced Dynamic Disk Memory Module (ADDMM).

FIG. 4 illustrates the ADDMM module functional block diagram showing the key interfaces and components of an exemplary memory module.

FIG. 5 is a detailed functional block diagram of an exemplary ADDMM controller identifying most of the major interfaces and control blocks for the controller.

FIG. 6 is an illustrative high-level operational flow chart for ADDMM.

FIG. 7 is a partition and address lookup table operational flow diagram showing a mechanism used to direct Data requests to the correct location in ADDMM.

FIG. 8 is an example of a memory address map for a single partition memory module.

FIG. 9 is an example of a memory address map for a two partition memory module.

FIG. 10 is an example of a memory address map for a four partition memory module.

FIG. 11 is an illustrative functional flow diagram for the address lookup tables used in an ADDMM controller.

FIG. 12 is an illustrative flow diagram of a DRAM ECC operation.

FIG. 13 is an illustrative flow diagram of a FLASH ECC operation, one required per channel of FLASH supported.

DETAILED DESCRIPTION

Reference herein to “one embodiment”, “an embodiment”, or similar formulations, means that a particular feature, structure, operation, or characteristic described in connection with the embodiment, is included in at least one embodiment of the present invention. Thus, the appearances of such phrases or formulations herein are not necessarily all referring to the same embodiment. Furthermore, various particular features, structures, operations, or characteristics may be combined in any suitable manner in one or more embodiments.

Terminology

The terms chip, integrated circuit, semiconductor device, and microelectronic device are sometimes used interchangeably in this field. The present invention relates to all of the foregoing as these terms are commonly understood in the field.

System performance of a personal computer (PC) may generally be improved by adding high performance memory. This is usually done by adding main memory or replacing an existing memory module already in the PC with a higher capacity memory module. Embodiments of the present invention typically include high speed memory, such as DRAM, and/or a combination of DRAM and slower writeable non-volatile memory such as FLASH. In the DRAM and FLASH configuration the DRAM acts as a buffer for the FLASH devices. The amount of DRAM and the amount of FLASH on this memory module is configurable at the time of manufacture.

In a further aspect of the invention, the memory module capacity is configurable by the PC to optimize performance for any given usage model. This configuration is called partitioning. A DRAM only configuration for this memory module may be used by the system under certain power conditions and power managed states. It is known that if power is interrupted volatile memory devices lose the integrity of their contents and hence must be reloaded with the appropriate application and data by the driver software or by the operating system. A combination DRAM and FLASH memory module can maintain this information indefinitely in the non-volatile FLASH devices. The applications and data will be maintained during any of the power-managed states including the power-off state of the system. Some examples of those states are full-on, standby, hibernate and power-off. Specialized driver software, or the operating system (OS), is responsible for managing and maintaining coherency of the applications and data stored on this memory module.

In one embodiment of the present invention, the memory module is a hybrid module including DRAM and FLASH memory. This configuration provides a desirable trade-off for a cost-effective high-performance solution in the PC environment. In this configuration the DRAM is used in a number of ways to improve system performance. First, a portion of the DRAM space, or all of the DRAM space, can be allocated as a write buffer to the FLASH memory devices. By buffering the write traffic to the FLASH devices in the DRAM, the ExpressCard interface can be freed up for the next request in the pipeline. In this module the DRAM can also be used as a read buffer where the most recently used data and applications are temporarily stored. The DRAM in this module allows for the PCI Express bus and the USB bus in the ExpressCard interface to run at full read and write bandwidths. Without the DRAM, the FLASH devices can sustain only a portion of the available interface read bandwidth and a small portion of the write bandwidth.

The DRAM also plays an important role in reducing the amount of random write traffic to the FLASH memory. Reducing the number of Erase/Write cycles to the FLASH devices is very important in managing the life expectancy of the FLASH memory. FLASH memory has a limited number of Erase/Write cycles before device failure is expected. This is known as FLASH endurance. FLASH endurance is an important parameter when using FLASH as a storage medium in PC systems.

Various memory modules in accordance with the present invention include memory that is used for creating a dedicated RAM-based disk (hereinafter referred to simply as RAM disk) for a personal computer system without consuming main memory resources. Since extra memory is added to the system, the system enjoys a performance advantage without the negative impact caused by consuming main memory to support a RAM disk. Alternatively, such a module may also operate in conjunction with an expandable system bus whereby a plurality of such modules may be installed.

Another issue in using a conventional RAM disk is that it relies on the volatile main memory to store the application and/or data image. If power is lost, or the system goes through a hard re-boot, the data is lost and the operating system or custom driver is required to restore the memory to a known good state. The re-loading process is often a slow and frustrating event when using a personal computer. Various embodiments of the present invention overcome these problems by including non-volatile and/or semi-volatile memory which substantially reduces or eliminates the problems associated with volatile DRAM-based RAM disks at a cost and performance level substantially better than that of FLASH only devices.

The memory hierarchy of FIG. 1 shows where ADDMM fits in the PC performance/capacity continuum. Modules in accordance with the invention fill the performance gap between main memory and the Hard Disk Drive (HDD). In order to fill that gap effectively the ADDMM uses both DRAM and FLASH to deliver the targeted performance and capacity. ADDMM is a cost-effective end-user add-in memory module that is pluggable in the system much like existing compact FLASH card readers and SD media card readers.

The block diagram of FIG. 2 shows the elements of a conventional PC with the various interfaces and components. The data in TABLE 1 shows the various usage models and where the interface points are. It is noted that there are PCI Express interfaces on both the North Bridge and South Bridge (i.e. chipset components). It is further noted that the USB bus is on the South Bridge along with the Serial ATA (SATA) interface. The ExpressCard interface used with embodiments of the present invention includes both a single PCI Express channel and a single USB Channel. Various embodiments work with both the PCI Express interface and the USB interface together under certain configurations. Various embodiments may also work with only one of the interfaces under certain other configurations. The location of the interface is due to existing system architectures but is not meant to be a limitation on the invention (see FIG. 3). The only requirement is that the embodiments be attached to at least one of the PCI Express or SATA interfaces. USB alone is an option but not the best interface for the preferred invention. Another point to note is that the closer the interface attach-point is to the CPU, the better the overall performance the PC realizes from use of the present invention.

TABLE 1 Interface Usage Options

Usage Option         PCI Express   USB   SATA   Attach Location
Interface usage #1   Yes           Yes   No     South Bridge
Interface usage #2   Yes           No    No     North or South Bridge
Interface usage #3   No            Yes   No     South Bridge
Interface usage #4   No            No    Yes    South Bridge

The block diagram of FIG. 3 shows five possible system configurations regarding where embodiments of the present invention may be used. Each of the configurations provides a different trade-off with respect to its own advantages, risks, performance, features and cost/complexity. The only new location being identified in FIG. 3 over existing systems is the possibility that either a PCI Express or a SATA interface may be added to the CPU.

Referring to FIG. 3, various alternative connection architectures, or design attach points, are described. Design Attach Point #1 shows ADDMM attached to the SATA interface. This location provides for the highest level of system interaction utilizing existing interfaces and software to manage the interface with existing disk drives such as RAID protocols and real-time disk coherency. The location and protocol add latency to the interface but provide a performance enhancement over what is provided by a HDD.

Design Attach Point #2 shows ADDMM on the ExpressCard interface. The ExpressCard is an industry standard interface that has a high degree of hardware and software support and is currently defined for PCIe Gen1 and USB 2.0 operation. Some existing disk software infrastructure is lost, as compared to Design Attach Point #1, but some performance enhancements may be gained through improved latency due to locality relative to the CPU on some PC designs.

Design Attach Point #3 shows an ExpressCard or a SATA interface on the North Bridge controller chip. This option shows a location that reduces system latency by eliminating the command transit through the South Bridge. The value this option provides is improved performance through reduced latency, and the opportunity to be used as part of UMA or main memory through software partitions.

Design Attach Point #4 shows a PCI Express interface re-using the Graphics interface port. This gives up to 16 PCIe channels for a substantial bandwidth improvement over a single PCIe and/or 3 to 4 SATA channels. This interface is also one of the lowest-latency interfaces available in the PC, and exists on almost all present day PCs.

Design Attach Point #5 shows an ExpressCard or SATA interface at the CPU. This eliminates one more level of latency for the ADDMM. The latency through this connection is similar to that of the main memory as there is only one controller through which the requests are routed. It is noted that this attach point does not eliminate the need for main memory since the bandwidth available in main memory is significantly higher than that of the ADDMM.

It is noted that the present invention is not limited to the above-described interfaces, and various embodiments of the present invention may be used with any suitable alternative memory system arrangement or interface.

Three illustrative configurations of the invention, 1) DRAM only; 2) FLASH only; and 3) both DRAM and FLASH; are discussed below.

The DRAM-only module configuration delivers the highest performance of any of the illustrative module configurations. A DRAM-only solution has limitations for usage during power-managed states since the DRAM is volatile. The DRAM-only module is not expected to maintain data if power is lost to the memory module unless an auxiliary power source is present. Because the DRAM loses data without an auxiliary power source, it is secure from tampering and secure from data theft when the module is removed from the system. The DRAM-only module may support multiple selectable partitions. Each partition can be treated equally and/or independently depending on usage model and system requirements.

The FLASH-only memory module configuration has better performance than existing USB FLASH drives but does not offer substantial performance benefits. This is not expected to be a typical configuration for the ADDMM; however, it is the lowest cost per bit capacity design. Performance of this configuration is expected to improve as FLASH technology improves. The FLASH-only configuration may support multiple partitions.

The DRAM and FLASH combination configuration is a compromise design delivering good performance and high capacity at a market-acceptable price point. This configuration can take full advantage of the ExpressCard interface performance much like the DRAM-only configuration. It has FLASH memory to provide non-volatile storage to better handle power-managed states and to provide adequate capacity. In operation, typical embodiments will not retain data in the DRAM during hibernate and power-off states, and/or in the event of a power loss, unless an auxiliary power backup is provided. The auxiliary power backup, if provided, is preferably sized to allow for the flushing of the DRAM content to FLASH in order to prevent data loss in case of an unplanned power interruption.

The disk drive and memory parameters shown in TABLE 2 highlight differences between solid state memory and hard disk drives with regard to latency and bandwidth. It is noted that the preferred DDR2 type DRAM has bandwidth that is at least 2× that of the presently available interfaces in present PC systems to which it may be connected. It is further noted that various types of FLASH, e.g., NAND flash memory, have lower read and lower write bandwidth than the presently available interfaces in PCs.

Referring to FIG. 4, a memory module functional block diagram in accordance with the present invention illustrates a DRAM interface and two interface channels for FLASH. Two FLASH channels are defined in order to increase the read bandwidth to an acceptable level. Doubling the FLASH interface width does improve write bandwidth, but write bandwidth remains substantially lower than that of the available interfaces, particularly with multi-level (ML) FLASH technology. Write bandwidth is a known bottleneck in the overall PC system performance when using FLASH memory. The DRAM is used as a write buffer to dramatically reduce the write bandwidth impact to system performance.

TABLE 2 Disk Drive & Memory B/W & Latency Parameters

Disk             Internal  External                                Avg Seek  Avg Latency  Spindle
Ultra ATA        100 Mb/s            100 MB/s/ch         now       12.5      5.56 ms      5400 rpm
SATA I           1.2  1.5 Gb/s       150  187 MB/s/ch    now       8.5       4.16 ms      7200 rpm
SATA II          2.4  3.0 Gb/s       300  375 MB/s/ch    2005
SATA III         4.8  6.0 Gb/s       600  750 MB/s/ch    2007
PCIe Gen1        2.0  2.5 Gb/s       250  313 MB/s/ch
PCIe Gen 2       4.0  5.0 Gb/s       500  625 MB/s/ch
USB 2.0          480.0 Mb/s          60 MB/s/ch

Memory x8        Raw Serial BW (rd wr)   Sustained BW (rd wr)         Latency (tRCD tRP tRC Avg)
DDR2 667         5.3 Gb/s                667 MB/s/x8         now      12  12  54 ns
DDR2 800         6.4 Gb/s                800 MB/s/x8         2005     12  12  54 ns
DDR3 1033        8.3 Gb/s                1033 MB/s/x8        2007     11  12  54 ns
DDR3 1333        10.7 Gb/s               1333 MB/s/x8        2008     10  12  53 ns
Nand FLASH       264  56 Mb/s            33  7 MB/s/x8                25 us
NOR FLASH        608  72/1 Mb/s          76  9/.133 MB/s              110 ns
One-Nand FLASH   544  144 Mb/s           68  18 MB/s                  30 us
2.5″ HDD         640  512 Mb/s           80  64 MB/s                  5.6 ms
uDrive           56  56 Mb/s             7  7 MB/s                    8.33 ms

Assumptions: Controller Latency 40.0 ns

Using simple calculations and adding estimates for controller latency, it is apparent that a memory device that is neither a disk cache nor part of main memory can add substantial performance to the system. Bandwidth is limited by the interface that is used to add this memory to the system. Available bandwidth also translates into access latency. If the available bandwidth is less than the interface bandwidth then the time it takes for the requested data to be read from, or written to, the memory will increase, thus reducing system performance. TABLE 2 outlines the bandwidth available from the different interfaces currently available in the system, which is used for the calculation.
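The relationship between available bandwidth and access time can be illustrated with a minimal sketch. The sustained bandwidth figures below are one reading of TABLE 2, the 4096 Byte request size is an illustrative assumption, and the names used are hypothetical rather than part of any embodiment.

```python
# Sustained bandwidths in MB/s, as read from TABLE 2 (an assumed reading of the table).
SUSTAINED_MB_PER_S = {
    "PCIe Gen1 (one channel)": 250,
    "USB 2.0": 60,
    "NAND FLASH write (x8)": 7,
}


def transfer_time_us(num_bytes: int, bandwidth_mb_per_s: float) -> float:
    """Time in microseconds to move num_bytes at the given bandwidth.

    Assumes 1 MB = 10**6 bytes and ignores protocol overhead.
    """
    seconds = num_bytes / (bandwidth_mb_per_s * 1e6)
    return seconds * 1e6


if __name__ == "__main__":
    request_bytes = 4096  # illustrative request size
    for name, bandwidth in SUSTAINED_MB_PER_S.items():
        time_us = transfer_time_us(request_bytes, bandwidth)
        print(f"{name}: {time_us:.1f} us per {request_bytes} Byte request")
```

Under these assumptions, the slower the path to the memory, the longer each request occupies it, which is the sense in which bandwidth translates into access latency.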

The benefit of using embodiments of the present invention can be seenfrom the following:

Average Disk Latency = 4.16 ms (see TABLE 2)

Average Memory Latency = 54 ns + 60 ns (transit time) + 100 ns (arbitration) = 214 ns

Latency Improvement:

Average Disk Latency / Average Memory Latency = Speed up

4.16 ms / 214 ns = approximately 19,440× improvement

Even if there is excessive latency under worst case memory access, assuming 5× worse, which is 1.07 us, the improvement would be >3500× over existing disk drives.
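The arithmetic above can be checked with a short sketch. The figures are those given in the text and TABLE 2; the variable names are illustrative only.

```python
# Latency figures from the text, expressed in seconds.
DISK_LATENCY_S = 4.16e-3   # average disk latency (7200 rpm row of TABLE 2)
DRAM_TRC_S = 54e-9         # DRAM access latency
TRANSIT_S = 60e-9          # transit time
ARBITRATION_S = 100e-9     # arbitration

# Average memory latency is the sum of the three components (214 ns).
memory_latency_s = DRAM_TRC_S + TRANSIT_S + ARBITRATION_S
speedup = DISK_LATENCY_S / memory_latency_s

# Worst case assumed in the text: memory access 5x slower (1.07 us).
worst_case_latency_s = 5 * memory_latency_s
worst_case_speedup = DISK_LATENCY_S / worst_case_latency_s

if __name__ == "__main__":
    print(f"memory latency: {memory_latency_s * 1e9:.0f} ns")
    print(f"speed up: {speedup:,.0f}x")
    print(f"worst-case speed up: {worst_case_speedup:,.0f}x")
```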

Latency is impacted by the bandwidth of various embodiments. The time it takes to get the data from the memory module into the system adds to the latency of the access. With embodiments having the above-described mix of DRAM and FLASH, the interface will be the performance limiter in presently available systems. As shown in TABLE 2, with just a single 8 bit wide DRAM device the preferred DRAM memory has 2× to 3× the bandwidth that the best external interface available today can consume. X16 DRAM devices can deliver 4× to 6× the necessary consumable bandwidth for the interface. This extra available bandwidth is important when looking at the interaction of the DRAM with the interfaces and with the FLASH. The added bandwidth allows the DRAM to concurrently service all available system interfaces to the memory module and the FLASH memory devices as the write buffer to those FLASH devices. The extra available bandwidth from the DRAM devices permits reduction of the operating frequency of the DRAM devices, thus making the memory module easier to design, improving the robustness of the memory module design, and typically reducing costs by using the lowest cost DRAM components.

Controller Operation

Various embodiments of the present invention include a controller to provide the functionality described herein. The controller functional block diagram in FIG. 5 shows the major elements, including sub-controller blocks, of the controller. In other words, the controller for the memory module in accordance with the present invention includes various logical blocks, which may also be referred to as sub-controller blocks, that control the operation of the various interfaces and features. An ExpressCard interface refers to a well-known and defined industry standard that is comprised of one PCI Express interface, one USB interface and one SMBUS interface. The primary operational connection is through the PCI Express interface. The USB interface can also be used under certain configurations and circumstances but it operates at a much lower performance point than does the PCI Express interface. The SMBUS interface is used for intra-system communication. The SMBUS controller block primarily manages information from a set of internal registers that are used for configuration support and memory module status. Those skilled in the art and having the benefit of this disclosure will appreciate that various logic and/or circuit blocks used for implementing the various “standard” interfaces may be, but are not required to be, re-used from pre-existing chip designs.

In one embodiment a display controller block, such as shown in FIG. 5, is provided to support pre-defined, or predetermined, visual indicators. Since there may be multiple memory modules in accordance with the present invention installed in a PC, and each of the instances may be used differently, it may be valuable to the user to have a visual indicator to identify the function of each of the memory modules. For example, control logic for driving light emitting diodes (LEDs) or similar visual indicators can be included within the controller of the memory module. Such control logic would activate one or more LEDs, or LEDs of different colors, to indicate to a user how the memory module is configured and/or to what interfaces it is coupled. Those skilled in the art and having the benefit of this disclosure will appreciate that various other indicator schemes may be provided within the scope of the present invention.

An illustrative power management controller block, such as shown in FIG. 5, is configured to operate in accordance with the industry standard ACPI Power Management specification. Part of this power management controller block is the option to build into embodiments of the present invention a voltage regulator (VR) controller for the module. Preferably this VR controller would be used to regulate and deliver all of the voltages needed by various embodiments.

An illustrative DRAM controller block, such as shown in FIG. 5, is configured to operate with DDR2 DRAM. Embodiments of the present invention are accessed differently than the main memory in a PC. Main memory in a PC typically accesses data in as small as a single DRAM read or write, which may be, for example, four words (64 bits) of data.

Often a system request requires only a portion of the retrieved data for the given application. The preferred invention is expected to have minimum access requirements of 512 Bytes of data and typically will require minimum data sets as large as 4K Bytes of data and possibly larger in the future. Due to the larger granularity of access requirements, the DRAM controller block configured to operate with DDR2 DRAM can be simplified.

In some embodiments, an error correction (ECC) engine may be included in the DRAM controller block to ensure data integrity on the memory module. This ECC engine is configured specifically for the DRAM. It detects and corrects data failures caused by soft errors and by hard errors at read time due to a failing memory bit or bits in the DRAM. The failed location information is used to update the available memory location in the memory allocation tables.

The ECC engine operates as shown in FIG. 12. A read request is sent to the DRAM. The data is returned from the DRAM and checked for errors under control of the ECC engine. If an error is detected but the error correction is disabled, then an error notification is sent to the requesting interface in place of the data expected. The failed location is then disabled by moving the address of that location to a failed map. The routine then returns to the Idle state and waits for the next read request. If the ECC engine does not detect an error after a DRAM read, then data is returned to the requesting interface and the ECC engine returns to idle to wait for the next read request. If the ECC engine detects an error and the correction function is enabled, then data is corrected and forwarded to the requesting interface. After forwarding corrected data to the requesting interface, the ECC controller checks to see if that location had previously failed. If the location had not previously failed, then a failed flag is set and the corrected data is re-written to the DRAM. If the failed flag had been previously set, then the data is written to a new location in the DRAM, and that data is then verified for correctness. If the data is verified, then the memory address map is updated with the current data status and the ECC engine returns to idle. If data is not verified, then an error report is issued to the requesting interface indicating that a data error has occurred and that the data was corrected but cannot be saved in a known good state. It is then up to the user to intervene regarding appropriate actions to take.
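The decision flow just described can be summarized as a minimal sketch. This models only the branching of the described read path, not an actual error-correcting code; the class, function, and outcome names are hypothetical, and whether an error was detected or a relocation verified is supplied by the caller as an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Optional, Set


class Outcome(Enum):
    DATA_RETURNED = auto()        # no error detected
    ERROR_NOTIFIED = auto()       # error detected, correction disabled
    CORRECTED_REWRITTEN = auto()  # first failure: flag set, data re-written in place
    CORRECTED_RELOCATED = auto()  # repeat failure: data moved and verified
    CORRECTED_NOT_SAVED = auto()  # repeat failure: relocation did not verify


@dataclass
class EccState:
    correction_enabled: bool = True
    failed_flags: Set[int] = field(default_factory=set)        # locations that failed once
    failed_map: Set[int] = field(default_factory=set)          # locations disabled outright
    address_map: Dict[int, int] = field(default_factory=dict)  # old address -> new address


def handle_read(state: EccState, address: int, error_detected: bool,
                new_address: Optional[int] = None,
                relocation_verifies: bool = True) -> Outcome:
    """Follow the described branches for one DRAM read, then return to idle."""
    if not error_detected:
        return Outcome.DATA_RETURNED
    if not state.correction_enabled:
        state.failed_map.add(address)  # disable the failed location
        return Outcome.ERROR_NOTIFIED
    # Correction enabled: corrected data is forwarded, then the location is checked.
    if address not in state.failed_flags:
        state.failed_flags.add(address)
        return Outcome.CORRECTED_REWRITTEN
    if relocation_verifies and new_address is not None:
        state.address_map[address] = new_address
        return Outcome.CORRECTED_RELOCATED
    return Outcome.CORRECTED_NOT_SAVED
```

For example, two successive correctable errors at the same address would first yield a re-write in place and then a relocation, with the address map updated only after the relocated data verifies.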

A flash controller block, as shown in FIG. 5, is also necessary. The illustrative embodiment of FIG. 5 is configured to support two channels of FLASH operating in lock step to double the available FLASH bandwidth. There are many issues when operating two channels of FLASH in lock step that must be managed. FIG. 13 shows some of the basic issues that must be dealt with in running dual channels of FLASH. First, reads and writes are handled somewhat differently. The write cycle in a dual channel environment is the most familiar and is not significantly different from that of a single channel FLASH write. The main operational difference to manage is that each channel is operated independently and when writing to a specific Logical Block Address (LBA) location each channel most probably will be pointing to physically different locations. Once the write sequences are started, the Flash controller block needs to wait until both write sequences are complete prior to returning to Idle to start another operation. In any given embodiment, depending on how large a buffer is included in the embodiment, it is possible for some controller blocks to start another write cycle on the FLASH channel that has completed its write sequence without having to wait for the write cycle on the second channel to complete. If a read is requested in the middle of a write cycle and that read hits a write that is in progress but not completed, the read must wait until the write cycle is completed unless the full read content is available in the buffer. If the read content is available, then the read data can be returned without waiting for the write cycle to complete. If the buffer contents were invalidated and the read data is only available in the FLASH devices, then the FLASH write cycle must be completed prior to issuing a read access to the FLASH devices.
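The write-completion and read-during-write rules described above can be captured in a minimal sketch. The function and enumeration names are hypothetical, and the inputs (whether each channel has finished, whether the read hits a write in progress, whether the buffer still holds the full read content) are assumed to be supplied by other controller logic.

```python
from enum import Enum, auto


class ReadAction(Enum):
    PROCEED_NORMALLY = auto()   # read does not collide with a write in progress
    SERVE_FROM_BUFFER = auto()  # collision, but the buffer holds the full read content
    WAIT_FOR_WRITE = auto()     # collision and data is only in FLASH: wait for the write


def dual_channel_write_done(channel1_done: bool, channel2_done: bool) -> bool:
    """A lock-step write is complete only when both channels have finished."""
    return channel1_done and channel2_done


def choose_read_action(hits_write_in_progress: bool,
                       buffer_has_full_content: bool) -> ReadAction:
    """Decide how to handle a read that arrives during a write cycle."""
    if not hits_write_in_progress:
        return ReadAction.PROCEED_NORMALLY
    if buffer_has_full_content:
        return ReadAction.SERVE_FROM_BUFFER
    return ReadAction.WAIT_FOR_WRITE
```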

A FLASH read request in a multi-channel FLASH environment departs significantly from access requests in a single channel FLASH read environment. FIG. 13 shows that a read request initiates a parallel read from both channel one and channel two. This read assumes that the read could not be serviced from the internal buffers or from the DRAM. Read data from each channel is sent to an alignment buffer. This alignment buffer is required due to the read nature of the FLASH devices. A consistent read latency is not guaranteed by each flash device in any given channel. The latency is dependent on several different factors in the FLASH devices. The primary factors are whether the devices are single level storage elements or multilevel storage elements, and in what state and where the data is stored. This is much more complicated than operating using multiple channels of DRAM, which all have deterministic latencies. The alignment buffer needs to be sized large enough to hold at least one access of each of the FLASH channels in order to align the data. A buffer size consisting of many FLASH data reads may be required to optimize system performance. The size of this buffer for optimal performance is dependent on data set request sizes and the nominal latency delta between the FLASH channels. Once data of the appropriate data set size, as requested by the host interface, has been collected (for simplicity, an 8 Byte request is assumed here) and those 8 Bytes are ready, the read data is queued to be sent to the requesting interface. Once the data is sent to the host queue, the state machine loops until the full data request is returned and then returns to idle to wait for the next command.
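A minimal sketch of an alignment buffer follows. It assumes that each channel returns its own data in request order, that the two channels complete at unpredictable times relative to each other, and that the combined data places the channel one half before the channel two half; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class AlignmentBuffer:
    """Pairs up read data from two FLASH channels that arrive at different times."""

    def __init__(self) -> None:
        # One FIFO per channel; index 0 is channel one, index 1 is channel two.
        self._queues: Tuple[Deque[bytes], Deque[bytes]] = (deque(), deque())

    def push(self, channel: int, chunk: bytes) -> None:
        """Store data returned by one channel (0 or 1)."""
        self._queues[channel].append(chunk)

    def pop_aligned(self) -> Optional[bytes]:
        """Return the next combined chunk once both channels have delivered, else None."""
        if self._queues[0] and self._queues[1]:
            return self._queues[0].popleft() + self._queues[1].popleft()
        return None


if __name__ == "__main__":
    buf = AlignmentBuffer()
    buf.push(0, b"AAAA")      # channel one returns first
    print(buf.pop_aligned())  # channel two has not delivered yet
    buf.push(1, b"BBBB")      # channel two returns later
    print(buf.pop_aligned())  # both halves of the 8 Byte request are now available
```

In this sketch the FIFOs are unbounded; a hardware buffer would have a fixed depth chosen from the request sizes and the latency delta between channels, as noted above.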

An ECC engine is also required for the FLASH controller block, as shown in FIG. 14. This ECC engine is not significantly different from the one used for the DRAM ECC engine; however, an ECC engine is required per channel of FLASH being implemented. The controller operation may be subtly different due to the operational differences between DRAM and FLASH and the differences in the ECC algorithm used by each memory type. Each FLASH channel must be treated as independent even if the channels are linked by a common data set. The specific ECC algorithms used by DRAM and by FLASH are significantly different. The differences are driven by the failure mechanisms and frequencies of expected failures in each memory type.

The functional block that ties the overall controller together is shown in FIG. 5 as the command manager data router address lookup tables block. This block is responsible for the interpretation of the commands as they are received from the PCI Express interface or from the USB interface. It then acts on those commands, issuing data read, data write, and status request commands to the appropriate control blocks. The functional operation is shown in FIG. 6 “Functional Flow Diagram”. For ease of explanation, the PCI Express interface is the described example. The USB interface would work similarly to the PCI Express interface.

Still referring to FIG. 6, during initialization of the system the PCI Express and USB interfaces are polled to identify add-in cards that may be present in the system. During that polling the ADDMM memory module identifies itself as an ATA compliant mass storage device to the host. A separate partition on the ADDMM memory module may be defined as a Bulk Only Transport (BOT) mass storage device on the USB interface. There is an industry standard limitation that any given partition can only be identified for use on one interface. This means that if the ADDMM memory module is configured with a single partition and is identified as a PCI Express interface client, it cannot also be defined as a USB client. If, however, the ADDMM memory module is configured as two partitions, one partition can be mapped to the PCI Express interface and the second partition can be mapped to the USB interface. This flexibility in the ADDMM memory module controller allows the module to be optimized for maximum throughput in a system and optimized to better support a user's needs.
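By way of illustration only, the rule that a partition is identified on one interface at most can be sketched as a small mapping check. The class name PartitionMap and the interface labels "PCIE" and "USB" are assumptions made for this sketch.

```python
class PartitionMap:
    """Maps each partition to one host interface at most."""

    INTERFACES = ("PCIE", "USB")  # labels are hypothetical

    def __init__(self):
        self._interface_of = {}

    def assign(self, partition, interface):
        if interface not in self.INTERFACES:
            raise ValueError("unknown interface: %r" % (interface,))
        current = self._interface_of.get(partition)
        if current is not None and current != interface:
            # A partition already identified on one interface cannot be
            # identified on the other as well.
            raise ValueError(
                "partition %r is already mapped to %s" % (partition, current)
            )
        self._interface_of[partition] = interface

    def interface_of(self, partition):
        return self._interface_of.get(partition)


# Two partitions: one on each interface.
pmap = PartitionMap()
pmap.assign(1, "PCIE")
pmap.assign(2, "USB")
```

With a single partition, a second call such as pmap.assign(1, "USB") is rejected, which mirrors the single-partition case described above.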

As shown in FIG. 6, a system request is sent to the PCI Express interface and is identified as an ADDMM operation. The ATA command interpreter then determines what action to take. One of five basic actions can be initiated.

The first possible action is to return to idle to wait for another command. This action is taken if the command is determined not to be for the ADDMM memory module or the ADDMM memory module does not recognize it as such. If the command was targeted to the ADDMM memory module but could not be recognized as a valid request, an error response may be initiated by the error handler.

The second possible action is that in certain circumstances the ATA command interpreter will return an error message to the host indicating that the command was targeted towards the correct location but due to some fault it is not able to respond.

The third action is a status request. The ATA command interpreter then looks up the appropriate data from the status registers and/or performs a status check, and then returns the appropriate status and response to the system. There may be some instances when performing a status check in which the appropriate status cannot be returned. In that case the request is marked as failed and passed to the error handler. Once past the status check, a lookup is performed and the response is returned to the host.

The fourth possible action is a data read request. The read request is passed on to the lookup table, where it is determined whether to get the data from the FLASH or from the DRAM. If the data is to be retrieved from the DRAM, then a read command with a length request is sent to the DRAM memory controller and the data associated with the read request is returned to the host. If the data is to be returned from the FLASH memory, then a command is sent to the FLASH controller with a length of data to retrieve. While the FLASH data is being retrieved, a flag may have been set that causes the controller not only to stream the data to the host but also to copy that data to the DRAM. If the data is copied to the DRAM, then a CACHE flag is set in the lookup table to direct future accesses for this data to the DRAM. To prevent operational problems if there is a power loss event of any type, the lookup tables keep pointers back to the FLASH memory locations in the event that data is needed by the host but cannot be retrieved from the DRAM. This data redundancy is managed for memory space up to the size of the available DRAM. Once the data has been returned to the host and, if the flag to copy to the DRAM was set, copied to the DRAM, the controller returns to idle and waits for another command. If an error is detected during the lookup, then the error is flagged and passed to the error handler.
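By way of illustration only, the read path described above can be sketched as follows. The names LookupEntry, read, and copy_to_dram_at are assumptions made for this sketch, the memories are modeled as byte arrays, and a raised exception stands in for the error handler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LookupEntry:
    flash_addr: int                  # pointer back to FLASH is always kept
    dram_addr: Optional[int] = None  # set once the data has been copied to DRAM
    cache: bool = False              # the CACHE flag in the lookup table


def read(lba, length, table, dram, flash, copy_to_dram_at=None):
    """Return `length` bytes for `lba`, preferring DRAM when CACHE is set."""
    entry = table.get(lba)
    if entry is None:
        # Stand-in for flagging the error and passing it to the error handler.
        raise LookupError("no lookup table entry for LBA %d" % lba)
    if entry.cache and entry.dram_addr is not None:
        start = entry.dram_addr
        return bytes(dram[start:start + length])
    start = entry.flash_addr
    data = bytes(flash[start:start + length])
    if copy_to_dram_at is not None:
        # Stream to the host and also copy to DRAM, then direct future
        # accesses for this data to the DRAM.
        dram[copy_to_dram_at:copy_to_dram_at + length] = data
        entry.dram_addr = copy_to_dram_at
        entry.cache = True
    return data


flash_mem = bytearray(b"0123456789ABCDEF")
dram_mem = bytearray(16)
lookup = {7: LookupEntry(flash_addr=4)}
first = read(7, 4, lookup, dram_mem, flash_mem, copy_to_dram_at=0)  # from FLASH
second = read(7, 4, lookup, dram_mem, flash_mem)                    # from DRAM
```

Because flash_addr is never cleared, clearing the CACHE flag (for example after a power loss) sends the next read back to the FLASH location.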

Still referring to FIG. 6, the fifth action is a data write request. The physical write address is translated through the lookup table. If the ADDMM memory module does not have DRAM installed, then data is written directly to the FLASH. There is a small write buffer in the controller, but it is not large enough to sustain multiple back-to-back write requests at full interface bus speeds directly to the FLASH. If DRAM is installed on the ADDMM memory module, then data is written to the DRAM regardless of FLASH availability unless it is flagged for writing directly to FLASH only. Write direct to FLASH is a special write sequence targeted at very specific functions and is not expected to be used in normal operations, although it may be supported by various embodiments. Special commands may be recognized by some embodiments to allow a direct to FLASH write for optimized system operation. Once the data is written to the DRAM, the controller marks the data as ready for writing to the FLASH. The FLASH write manager (as shown in FIG. 15) is then responsible for writing the DRAM data to FLASH. The FLASH write manager is also responsible for writing any data that may be placed in the FLASH write buffer regardless of the data's origin. The FLASH write manager determines if a data write to FLASH is ready. It then copies the DRAM data to the FLASH write buffer. Once the FLASH write buffer is filled with the appropriate data, a FLASH write sequence is started. The FLASH write manager then writes the data to FLASH when the FLASH is ready to accept it. There is a timer that can be configured for writing to the FLASH. If the FLASH is not ready to be written to before the timer times out, the controller flags an error to the error handler and returns the appropriate error message to the host. If FLASH memory is present and it is ready to begin the very slow process of writing the data to FLASH, then the FLASH write sequence is started.

It is expected that the FLASH write operation can, and most likely will, be interrupted due to possible read or status activity that must be performed by the host. Since FLASH write data is buffered by DRAM, the write cycles can be stopped to service a critical request. The DRAM to FLASH write manager determines if the FLASH write buffer is ready to accept write data from the DRAM. When the FLASH write buffer is ready, data is read from the DRAM and put into the buffer. If the buffer is not ready, then the controller waits until it is. If the write timer times out, then an error is flagged and passed on to the error handler. Once the FLASH write buffer has valid contents for a write to FLASH, the FLASH memory is checked to see if it is ready to start a write cycle. If the FLASH is not ready, then the controller waits until it is. If the wait timer times out, then an error has occurred and the information is passed to the error handler. Once the FLASH is ready to accept a write, data is written from the write buffer to the FLASH. This sequence of checking FLASH ready and writing from the FLASH write buffer continues until the last data from the buffer is written, at which time the controller returns to the idle state to wait for the next write command. Once the write to FLASH is complete for each of the logical blocks, the address lookup tables are updated with the new location information. If the write cycle from DRAM to FLASH is interrupted prior to completion, then the lookup tables are not updated until the write sequence is complete for each logical block address. Only two activities can interrupt a write cycle to FLASH: a FLASH read request and a FLASH status request. If a write request is issued to FLASH that is required to bypass the DRAM memory, that write cannot be completed until the write sequence or sequences from the DRAM to the FLASH are completed. This is a very special case and should rarely if ever occur.
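By way of illustration only, the DRAM-buffered write sequence described above can be sketched as follows. The class name FlashWriteManager, the callables flash_ready and interrupt_pending, and the poll-count model of the write timer are assumptions made for this sketch and are not taken from FIG. 15.

```python
class FlashWriteManager:
    """Drains DRAM-buffered writes to FLASH one logical block at a time."""

    def __init__(self, timeout_polls=3):
        self.pending = {}   # lba -> data held in DRAM, marked ready for FLASH
        self.flash = {}     # lba -> data committed to FLASH
        self.lookup = {}    # lba -> "DRAM" or "FLASH" (current location)
        self.timeout_polls = timeout_polls  # stand-in for the write timer

    def host_write(self, lba, data):
        """Host writes land in DRAM and are marked ready for FLASH."""
        self.pending[lba] = data
        self.lookup[lba] = "DRAM"

    def drain(self, flash_ready, interrupt_pending=lambda: False):
        """Write pending blocks to FLASH; return "idle" or "interrupted"."""
        for lba in list(self.pending):
            if interrupt_pending():
                # A read or status request is serviced first; blocks not yet
                # written stay in DRAM and their lookup entries are unchanged.
                return "interrupted"
            polls = 0
            while not flash_ready():
                polls += 1
                if polls >= self.timeout_polls:
                    raise TimeoutError("FLASH not ready before timer expired")
            self.flash[lba] = self.pending.pop(lba)
            # The lookup table is updated only after the block is written.
            self.lookup[lba] = "FLASH"
        return "idle"


mgr = FlashWriteManager()
mgr.host_write(1, b"a")
mgr.host_write(2, b"b")
requests = iter([False, True])  # a host read arrives after the first block
state = mgr.drain(lambda: True, lambda: next(requests))
```

After the interrupted pass, block 1 is located in FLASH while block 2 still resolves to DRAM; a later call to drain resumes with block 2.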

A critical operational element of the memory module controller is how the lookup tables work and how they are managed. FIG. 7 shows how a PCI Express command is parsed by the ATA command interpreter, and the flow of the lookup and movement of the physical address to the FLASH memory request queue or the DRAM request queue. The ATA command interpreter strips out the partition enable and the starting logical block address (LBA). The partition information is used to enable the block of the lookup table in which the LBA identifies the physical address of the DRAM or FLASH location from which data is read and/or to which data is written. The physical address is then attached to the command in the FLASH request queue or in the DRAM request queue. In certain circumstances no partition may be identified in association with the request. Under this condition the lookup table controller sends an error report to the host. There are also certain conditions under which the lookup table returns an invalid physical address from the LBA and partition information. If this happens, an error is returned to the host through the error handling manager.
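By way of illustration only, the translation of a partition and LBA into a physical address and request queue can be sketched as follows. The function name route, the dictionary layout of the command and tables, and the returned status strings are assumptions made for this sketch and are not taken from FIG. 7.

```python
def route(command, tables, dram_queue, flash_queue):
    """Attach a physical address to the command and queue it for DRAM or FLASH.

    tables: {partition: {lba: (memory_type, physical_address)}} (hypothetical)
    """
    partition = command.get("partition")
    if partition is None or partition not in tables:
        # No partition identified with the request: report an error.
        return "error: no partition"
    entry = tables[partition].get(command["lba"])
    if entry is None:
        # The lookup produced no valid physical address.
        return "error: invalid physical address"
    memory_type, physical = entry
    queue = dram_queue if memory_type == "DRAM" else flash_queue
    queue.append({**command, "physical": physical})
    return "queued"


tables = {1: {0: ("DRAM", 0x100)}, 3: {0: ("FLASH", 0x8000)}}
dram_q, flash_q = [], []
status_a = route({"op": "read", "partition": 1, "lba": 0}, tables, dram_q, flash_q)
status_b = route({"op": "read", "partition": 3, "lba": 0}, tables, dram_q, flash_q)
status_c = route({"op": "read", "lba": 0}, tables, dram_q, flash_q)
```

Selecting the partition first and then indexing by LBA keeps the two error conditions separate: a missing partition and an LBA with no valid physical address.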

FIG. 11 shows how the lookup tables are managed. The lookup tables use the partition information to select which block of the lookup table to operate from. They then use the logical block address information to look up the entry associated with that partition, whose content identifies the physical location of the data being requested. TABLE 3 shows how the lookup table entries are allocated. The lookup tables contain specific pointers to the memory types available and to the locations where information is stored.

TABLE 3 Lookup Table Entry Allocation Map

Memory Type                        | Physical Location
DRAM            | FLASH            | Physical ADDRESS DRAM | Physical ADDRESS FLASH CH1 | Physical ADDRESS FLASH CH2
1 = Yes, 0 = No | 1 = Yes, 0 = No  | (PAD)                 | (PAF1)                     | (PAF2)

The physical address for the DRAM and each channel of FLASH is managed separately. Because of FLASH specific issues surrounding endurance management, and because some of the usage models allow the DRAM to operate independently of the FLASH, each channel of the FLASH device needs to be managed independently of the other. An example of this is shown in FIG. 10, where partition #1 is a DRAM only partition, partition #2 is a DRAM and FLASH partition, and partitions #3 and #4 are FLASH only partitions. In this example, the partition table entries would look like the entries shown in TABLE 4.

TABLE 4 Example Lookup table mapping for FIG. 10

             | DRAM | FLASH | ADDRESS
Partition #1 | 1    | 0     | PAD  | n.a. | n.a.
Partition #2 | 1    | 1     | PAD  | PAF1 | PAF2
Partition #3 | 0    | 1     | n.a. | PAF1 | PAF2
Partition #4 | 0    | 1     | n.a. | PAF1 | PAF2
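By way of illustration only, the entry layout of TABLE 3 and the example mapping of TABLE 4 can be expressed as a small data structure. The names TableEntry and make_entry and the numeric addresses are assumptions made for this sketch; None plays the role of "n.a.".

```python
from typing import NamedTuple, Optional


class TableEntry(NamedTuple):
    dram: bool            # memory type flag: 1 = Yes, 0 = No
    flash: bool           # memory type flag: 1 = Yes, 0 = No
    pad: Optional[int]    # physical address, DRAM
    paf1: Optional[int]   # physical address, FLASH channel 1
    paf2: Optional[int]   # physical address, FLASH channel 2


def make_entry(dram, flash, pad=None, paf1=None, paf2=None):
    """Build an entry, checking that addresses agree with the type flags."""
    if dram != (pad is not None):
        raise ValueError("PAD must be present exactly when DRAM is flagged")
    if flash and (paf1 is None or paf2 is None):
        raise ValueError("PAF1 and PAF2 are both required when FLASH is flagged")
    if not flash and (paf1 is not None or paf2 is not None):
        raise ValueError("PAF1 and PAF2 must be absent when FLASH is not flagged")
    return TableEntry(dram, flash, pad, paf1, paf2)


# The four partitions of the FIG. 10 example; addresses are placeholders.
partitions = {
    1: make_entry(True, False, pad=0x0000),
    2: make_entry(True, True, pad=0x1000, paf1=0x0000, paf2=0x0000),
    3: make_entry(False, True, paf1=0x4000, paf2=0x4000),
    4: make_entry(False, True, paf1=0x8000, paf2=0x8000),
}
```

Keeping PAF1 and PAF2 as separate fields reflects the requirement that each FLASH channel be managed independently of the other.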

Various embodiments of the present invention support data encryption and digital rights management. Data encryption is primarily handled by the operating system and has minimal impact on embodiments of the invention. In various embodiments digital rights management may conform to industry standard requirements.

The built-in self test (BIST) controller shown in FIG. 5 is typically configured specifically for each embodiment. Its primary function is to reduce test time during manufacturing and assembly. The BIST controller is not generally needed subsequent to the testing procedures of the manufacturer.

CONCLUSION

Embodiments of the present invention find application in PC systems where system performance can be enhanced by adding memory in the PC system memory hierarchy that has substantially better performance than conventional hard disk drives, and better performance than conventional USB FLASH drives.

Embodiments of the present invention can also find application in consumer electronics products where media content (e.g., video, audio) is desired. Examples of such consumer electronics products include, but are not limited to, video camcorders, televisions, personal display devices, gaming consoles and personal gaming devices.

Advantages of some embodiments of the present invention include increased performance of a PC platform, reduced power consumption in a PC platform, and extended life for the hard disk drives in a PC platform.

It is to be understood that the present invention is not limited to the embodiments described above, but encompasses any and all embodiments within the scope of the subjoined Claims.

1. A memory module; comprising: a first memory comprising a volatile memory; a second memory comprising two or more non-volatile memory devices; and a controller coupled to the first memory and the second memory, the controller comprising: a command interpreter; one or more bus interface controller blocks coupled to the command interpreter, each of the one or more bus interface controller blocks further coupled to a corresponding one of one or more bus interfaces; a first memory controller block, coupled to the command interpreter, for communicating with the first memory; and a second memory controller block, coupled to the command interpreter, for communicating with the second memory, such that a first one of the two or more non-volatile memory devices is coupled to a first channel interface of the second memory controller block and a second one of the two or more non-volatile memory devices is coupled to a second channel interface of the second memory controller block; at least one configuration register, the at least one configuration register coupled to the command interpreter; wherein the memory module is adapted to physically and electrically couple to a computer system having a main memory, receive and store data from the computer system, and retrieve and transmit data to the computer system; wherein neither the first memory or the second memory form part of the main memory; wherein the command interpreter receives commands from the computer system; wherein the controller is operable to receive and store memory partition configuration information of the memory module; wherein the second memory controller block includes at least one alignment buffer to receive and store read data from each of the first and the second non-volatile memory devices, which read data arrives at the alignment buffer responsive to simultaneous read requests from the second memory controller block; and wherein the read data from each of the first and the second non-volatile memory devices arrives at a different time.
2. The memory module of claim 1, wherein a first memory partition includes volatile memory only, a second memory partition includes volatile memory and non-volatile memory, and a third memory partition includes non-volatile memory only.
3. The memory module of claim 1, further comprising a plurality of look-up tables coupled to the command interpreter.
4. The memory module of claim 1, wherein data images are stored for accelerated access.
5. The memory module of claim 1, wherein an OS boot image from a power-off or hibernate state is stored for use to improve system boot time from a power-off or hibernate state.
6. The memory module of claim 1, wherein the volatile memory is a DRAM memory, and the non-volatile memory is a Flash memory.
7. The memory module of claim 1, wherein the command interpreter is operable to receive an access request, determine the type of access, lookup first memory addresses and second memory addresses, send the access request to a first memory access queue if the first address is valid and send the access request to a second memory access queue if the first address is not valid; and wherein the access request is selected from the group consisting of read access request and write access request.
8. A memory module; comprising: a controller comprising: a command interpreter; one or more bus interface controller blocks coupled to the command interpreter; a first memory controller block, coupled to the command interpreter; a second memory controller block, coupled to the command interpreter, the second memory controller block having a first channel interface, a second channel interface, and an alignment buffer coupled to each of the first and second channel interfaces; at least one configuration register, the at least one configuration register coupled to the command interpreter; a first memory comprising a volatile memory, the first memory coupled to the first memory controller block of the controller; a second memory comprising two or more Flash memory devices, a first one of the two or more non-volatile memory devices coupled to the first channel interface, and a second one of the two or more non-volatile memory devices coupled to the second channel interface and wherein the memory module is adapted to physically and electrically couple to a computer system having a main memory, receive and store data from the computer system, and retrieve and transmit data to the computer system; wherein the controller is operable to receive and store memory partition configuration information of the memory module; wherein the alignment buffer is operable to receive and store read data from each of the first and the second non-volatile memory devices, which read data arrives at the alignment buffer responsive to read requests from the first and second channel interfaces of the second memory controller block; and wherein the read data from each of the first and the second non-volatile memory devices arrives at a different time.
9. The memory module of claim 8, wherein the volatile memory comprises DRAM and the non-volatile memory comprises Flash.
10. A memory module; comprising: a controller comprising: a command interpreter; one or more bus interface controller blocks coupled to the command interpreter; a first memory controller block, coupled to the command interpreter; a second memory controller block, coupled to the command interpreter, the second memory controller block having a first channel interface, a second channel interface, and an alignment buffer coupled to each of the first and second channel interfaces; at least one configuration register, the at least one configuration register coupled to the command interpreter; a second memory comprising two or more non-volatile memory devices, a first one of the two or more non-volatile memory devices coupled to the first channel interface, and a second one of the two or more non-volatile memory devices coupled to the second channel interface; and wherein the memory module is adapted to physically and electrically couple to a computer system having a main memory, receive and store data from the computer system, and retrieve and transmit data to the computer system; wherein the controller is operable to receive and store memory partition configuration information of the memory module; wherein the controller is operable to determine whether a volatile memory is coupled to the first memory controller; wherein the alignment buffer is operable to receive and store read data from each of the first and the second non-volatile memory devices, which read data arrives at the alignment buffer responsive to parallel read requests from the first and second channel interfaces of the second memory controller block; and wherein the read data from each of the first and the second non-volatile memory devices arrives at a different time.
11. The memory module of claim 10, further comprising a volatile memory, the volatile memory coupled to the first memory controller block.
12. The memory module of claim 11, wherein the volatile memory comprises DRAM and the non-volatile memory comprises Flash.
13. A memory module; comprising: a first memory comprising a volatile memory; a second memory comprising two or more non-volatile memory devices; and a controller coupled to the first memory and the second memory, the controller comprising: a command interpreter; one or more bus interface controller blocks coupled to the command interpreter, each of the one or more bus interface controller blocks further coupled to a corresponding one of one or more bus interfaces; a first memory controller block, coupled to the command interpreter, for communicating with the first memory; and a second memory controller block, coupled to the command interpreter, for communicating with the second memory, such that a first one of the two or more non-volatile memory devices is coupled to a first channel interface of the second memory controller block and a second one of the two or more non-volatile memory devices is coupled to a second channel interface of the second memory controller block; at least one configuration register, the at least one configuration register coupled to the command interpreter; wherein the memory module is adapted to physically and electrically couple to a computer system having a main memory, receive and store data from the computer system, and retrieve and transmit data to the computer system; wherein neither the first memory or the second memory form part of the main memory; wherein the command interpreter receives commands from the computer system; wherein the controller is operable to receive and store memory partition configuration information of the memory module; wherein the second memory controller block includes at least one alignment buffer to receive and store read data from each of the first and the second non-volatile memory devices, which read data arrives at the alignment buffer responsive to parallel read requests from the second memory controller block; and wherein the read data from each of the first and the second non-volatile memory devices arrives at a different time.